Nature Computational Science
Springer Science and Business Media LLC
Preprints posted in the last 30 days, ranked by how well they match Nature Computational Science's content profile, based on 50 papers previously published here. The average preprint has a 0.05% match score for this journal, so anything above that is already an above-average fit.
Huang, Q.; Mourra-Diaz, C. M.; Wen, X.; Payette, D.
Phylogenetic inference remains computationally challenging due to the exponentially growing tree-topology search space, and current methods rely heavily on multiple sequence alignments (MSAs), which are expensive and error-prone. We propose AncestorGFN, a proof-of-concept approach leveraging Generative Flow Networks (GFlowNets) for simultaneous sequence generation and phylogenetic exploration without requiring explicit MSAs. Our method learns to generate sequences matching a target distribution while the flow trajectories implicitly encode structural relationships among sequences. We demonstrate that greedy traceback on maximum-flow trajectories recovers shared intermediate states suggestive of common ancestry, and evaluate on the let-7 microRNA family, where the learned flow structure qualitatively captures phylogenetic branching patterns. Furthermore, beam search at inference time discovers novel sequences clustering near known targets, suggesting applications in de novo sequence design. This work establishes an initial foundation for alignment-free phylogenetic exploration using generative models.
Lteif, D.; Jia, S.; Bit, S.; Kaliaev, A.; Mian, A. Z.; Small, J. E.; Mangaleswaran, B.; Plummer, B. A.; Bargal, S. A.; Au, R.; Kolachalama, V. B.
Structural magnetic resonance imaging (MRI) is a cornerstone for diagnosing neurological disorders, yet automated interpretation of multi-sequence brain MRI remains limited by challenges in cross-sequence reasoning and protocol variability. Here we present ReMIND, a vision-language modeling framework tailored for comprehensive multi-sequence and multi-volumetric brain MRI analysis. Trained on over 73,000 deidentified patient visits encompassing more than 850,000 MRI sequences paired with radiology reports from diverse clinical and research cohorts, ReMIND combined large-scale instruction tuning on more than one million clinically grounded question-answer (QA) pairs with targeted supervised fine-tuning for radiology report generation. At inference, ReMIND employed modality-aware reranking and correction, a report-level decoding strategy that suppressed unsupported modality claims while preserving linguistic fluency and clinical coherence. Cross-cohort generalization was maintained on independent external datasets from different institutions. These findings represent an advance toward consistent and equitable brain MRI interpretation, meriting prospective evaluation to support diagnosis and management of neurological conditions.
Li, K.; Hou, Y.; Mukherjee, B.; Pitzer, V. E.; Weinberger, D. M.
Household transmission studies are important for understanding infectious disease transmission and evaluating interventions; however, they are frequently constrained by methodological challenges, including in study design and sample size determination, and in estimating parameters of interest after collecting the data. Existing tools often lack flexibility in modeling age-specific susceptibility, infectivity patterns, and the impact of interventions such as vaccination or prophylaxis. Here, we develop HHBayes, an open-source R package that provides a unified framework for simulating and analyzing household transmission data using Bayesian methods. The package enables researchers to: (1) simulate realistic household transmission dynamics with highly customizable variables; (2) incorporate viral load data (measured in viral copies/mL or cycle threshold values) to model time-varying infectiousness; (3) estimate age-dependent susceptibility and infectivity parameters using Hamiltonian Monte Carlo methods implemented in Stan; and (4) evaluate intervention effects through user-defined covariates that modify susceptibility or infectivity. We demonstrate the capabilities of the package through simulation studies showing accurate parameter recovery and applications to seasonal respiratory virus transmission, including the impact of vaccination and antiviral prophylaxis on household attack rates. HHBayes addresses a critical gap in infectious disease epidemiology by providing researchers with accessible tools for both prospective study design and retrospective data analysis. The flexibility of the package in handling complex household structures, time-varying infectiousness, and intervention effects makes it valuable for studying diverse pathogens.
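HHBayes itself is an R package, but the generative process it simulates is a classic chain-binomial household model. A minimal, language-agnostic Python sketch of that process (function name and parameterization are invented for illustration, not the package's API):

```python
import random

def simulate_household(size, p_community, p_household, seed=None):
    """Generation-based chain-binomial household outbreak.

    p_community: per-person probability of infection from outside the household.
    p_household: per-contact, per-generation transmission probability between
                 an infectious and a susceptible household member.
    Returns the final number of household members ever infected.
    """
    rng = random.Random(seed)
    # Community introductions seed the outbreak.
    status = ["I" if rng.random() < p_community else "S" for _ in range(size)]
    newly_infectious = [i for i, s in enumerate(status) if s == "I"]
    while newly_infectious:
        next_gen = []
        for i, s in enumerate(status):
            if s != "S":
                continue
            # Probability of escaping infection from every current infective.
            escape = (1 - p_household) ** len(newly_infectious)
            if rng.random() > escape:
                next_gen.append(i)
        for i in newly_infectious:
            status[i] = "R"  # previous generation recovers
        for i in next_gen:
            status[i] = "I"
        newly_infectious = next_gen
    return sum(1 for s in status if s == "R")
```

The package layers age-dependent susceptibility/infectivity, viral-load-driven infectiousness, and Bayesian inference on top of this kind of core process.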
Fei, P.; Dustin, M. L.
Upon T cell receptor (TCR) engagement, a T cell forms an immunological synapse (IS) with an antigen-presenting cell (APC), which can be mimicked by purified ligands on supported lipid bilayers (SLBs)1,2. Microvilli actively scan the surface; upon initial engagement, F-actin-dependent TCR microclusters form, and the central supramolecular activation cluster (cSMAC) sustains TCR signaling in CD8 T cells3,4. Although signaling activities within the IS have been observed qualitatively through total internal reflection immunofluorescence microscopy5-7, the stoichiometric relationships among the components of the TCR signalosome remain unknown. In this study, we employed a two-step approach to quantify the components of the TCR signalosome. First, Jurkat cell lines expressing GFP-tagged proteins on a knockout background were used to calibrate fluorescence intensity (IF) signals against molecular copy numbers, based on measurements of single-tag signals and multiple corrections. In the second step, this calibration was applied to determine the stoichiometries of key TCR signalosome components, including TCR, CD8, CD28, CD45, PD-1, Lck, ZAP-70, LAT, and PLCγ1, across scanning, early activation, and sustained activation states in human primary T cells. We refer to the method as quantitative extrapolation from single-tags (QuEST) immunofluorescence microscopy. Applying QuEST, we were surprised to find that the ZAP-70:TCR ratio in microclusters and the cSMAC was 1:1, far from the potential 10:1 ratio. Nanoscale structures of the TCR signalosome were further captured using direct stochastic optical reconstruction microscopy (dSTORM), confirming that ZAP-70 was strongly co-localized with the TCR. Moreover, we applied QuEST to confirm the presence of T cell intrinsic CD28 recruitment, independent of CD80 or CD86 on SLBs, during TCR activation. This T cell intrinsic CD28 recruitment could be disrupted through engagement of PD-1 with PD-L1 on SLBs.
This shows that PD-1 engagement can disrupt T cell intrinsic CD28 costimulation. QuEST provides a broadly applicable pipeline for quantitative analysis of TCR signalosomes in human primary cells, enabling a quantitative basis for the rational manipulation and engineering of the TCR signalosome in immunotherapies.
Chauveau, M.; Kleeorin, Y.; Hinds, E.; Junier, I.; Ranganathan, R.; Rivoire, O.
High dimensionality and multiscale statistical structure are pervasive features of biological data, posing fundamental challenges for modeling. Because model inference generally proceeds with far fewer data than parameters, statistical patterns across scales are often unevenly represented. Protein sequences provide a paradigmatic example: statistics across homologs are inherently multiscale, displaying collective correlations among conserved residue sectors that encode function, alongside localized correlations corresponding to physical contacts outside these sectors. Standard regularization strategies used to mitigate undersampling during model inference have been shown to capture these patterns unevenly, a bias that compromises generative models of protein sequences by limiting their ability to produce both functional and diverse proteins. This limitation is exemplified by Boltzmann machine-based generative models, which so far have required post hoc corrections to recover functionality, at the cost of reduced sequence diversity. Here, we introduce the stochastic Boltzmann Machine (sBM), a new regularization strategy that more accurately captures different correlation scales. Through analyses of theoretical models with known ground-truth parameters and experiments on the chorismate mutase family, we show that sBM effectively mitigates distortions in the estimation of model parameters, enabling the generation of functional sequences with greater diversity and without the need for post hoc corrections. These results advance the inference of generative models that more faithfully reflect the evolutionary constraints shaping protein sequences.
Xu, M.; Schmidt, A.; Zhang, Q.
Recent advances in single-cell-resolution spatial transcriptomics (ST) enable the profiling of gene expression while preserving the precise locations of individual cells, allowing quantitative investigation of how cellular organization relates to molecular state. A fundamental yet under-modeled aspect of organization is local cell density, which varies across microenvironments and can be linked to transcriptional programs. However, rigorous computational frameworks to quantify density-expression correlations remain lacking. Here, we present DenMark (Density-dependent Marked point process framework), a unified statistical framework that jointly models cell locations and gene expression in single-cell-resolution ST data, enabling identification of density-correlated genes while naturally providing uncertainty quantification. To scale inference, DenMark leverages a Hilbert space Gaussian process approximation. In simulations, DenMark provides accurate and well-calibrated estimates of density-expression association. Across single-cell ST platforms, including MERFISH and 10x Xenium, and across brain and cancer tissues, DenMark identifies genes whose expression is associated with cellular clustering and reveals density-related biological programs.
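The underlying question, whether a gene's expression tracks local cell density, can be sketched without the full marked-point-process machinery. A minimal Python illustration (the kNN density proxy, the plain-rank Spearman, and the function name are all assumptions for illustration, not DenMark's model, which additionally provides calibrated uncertainty):

```python
import numpy as np

def density_expression_rho(coords, expression, k=5):
    """Spearman correlation between local cell density and expression.

    coords: (n, 2) cell positions; expression: (n,) per-cell values for one
    gene. Local density is proxied by the inverse mean distance to the k
    nearest neighbouring cells.
    """
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))
    np.fill_diagonal(dist, np.inf)          # exclude self-distance
    density = 1.0 / np.sort(dist, axis=1)[:, :k].mean(axis=1)

    def rank(x):                            # plain ranks; no tie correction
        r = np.empty(len(x))
        r[np.argsort(x)] = np.arange(len(x))
        return r

    return np.corrcoef(rank(density), rank(expression))[0, 1]
```

A gene expressed more highly inside a dense cluster of cells than in sparsely placed cells yields a strongly positive rho under this sketch.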
Dibble, A.; Dalby, C.; Sevegnani, M.; Fracasso, A.; Lyall, D. M.; Harvey, M.; Svanera, M.
Precision neuroimaging aims to deliver individualized assessments of brain health, yet a single structural MRI does not yield a multidimensional, quantitative summary of an individual's current health or future risk. Existing approaches optimize task-specific objectives, yielding representations entangled with cohort- or disease-specific signals rather than capturing biologically grounded patterns of anatomical variation. Here, we introduce NeuroFM, a foundation model trained exclusively on 100,000 healthy synthetic volumes to predict morphometric and demographic targets. Without exposure to diagnostic labels, NeuroFM organizes brain MRIs into population-level patterns that encode meaningful brain health differences. These representations transfer across five neuroscience domains without adaptation and support simple linear readouts for clinical, cognitive, developmental, socio-behavioural, and image quality control. Evaluated on 136,361 real volumes spanning multiple cohorts, NeuroFM generalizes across domains and enables individual-level brain health profiling, estimating future dementia risk years before diagnosis. Together, these findings establish a disease-naive foundation model paradigm for precision neuroimaging.
Muller, B.; Ortiz Barranon, A. A.; Roberts, L.
Dysarthric speech severity assessment typically requires either trained clinicians or supervised machine learning models built from labelled pathological speech data, limiting scalability across languages and clinical settings. We present a training-free method (no supervised severity model is trained; feature directions are estimated from healthy control speech using a pretrained forced aligner) that quantifies dysarthria severity by measuring the degradation of phonological feature subspaces within frozen HuBERT representations. For each speaker, we extract phone-level embeddings via the Montreal Forced Aligner, compute d′ scores along phonological contrast directions (nasality, voicing, stridency, sonorance, manner, and four vowel features) derived exclusively from healthy control speech, and construct a 12-dimensional phonological profile. Evaluating 890 speakers across 10 corpora, 5 languages for the full MFA pipeline (English, Spanish, Dutch, Mandarin, French) and 3 primary aetiologies (Parkinson's disease, cerebral palsy, amyotrophic lateral sclerosis), we find that all five consonant d′ features correlate significantly with clinical severity (random-effects meta-analysis rho = -0.50 to -0.56, p < 2 x 10^-4; pooled Spearman rho = -0.47 to -0.55 with bootstrap 95% CIs not crossing zero), with the effect replicating within individual corpora, surviving FDR correction, and remaining robust to leave-one-corpus-out removal and alignment quality controls. Nasality d′ decreases monotonically from control to severe in 6 of 7 severity-graded corpora. Mann-Whitney U tests confirm that all 12 features distinguish controls from severely dysarthric speakers (p < 0.001). The method requires no dysarthric training data and applies to any language with an existing MFA acoustic model (currently 29 languages) or a model trained from healthy speech alone. It produces clinically interpretable per-feature profiles.
We release the full pipeline and phone feature configurations for six languages to support replication and clinical adoption. Author Summary: One of the authors has lived with ALS for sixteen years. Bernard Muller, who built this entire analytical pipeline using only eye-tracking technology, has experienced the progression of the disease firsthand, including the dysarthric speech that comes with advancing ALS and the tracheostomy that followed. The problem this paper addresses is not abstract to him, and that shapes how the method was designed. We developed a method to measure how well a person with dysarthria can produce distinct speech sounds, without needing any recordings of disordered speech for training. Our approach works by analysing how a widely available AI speech model organises different sound categories -- such as nasal versus oral consonants, or voiced versus voiceless sounds -- and measuring whether those categories become harder to tell apart. We tested this on 890 speakers across 10 datasets in five languages, covering Parkinson's disease, cerebral palsy, and ALS. Because the method only needs healthy speech recordings to set up, it applies to any language with an existing acoustic model, currently covering 29 languages. The resulting profiles show clinicians which specific aspects of speech production are degrading, rather than providing a single opaque severity score. This could support remote monitoring of speech decline in neurodegenerative disease and enable screening in languages and settings where specialist assessment is unavailable.
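The per-feature d scores can be read as the classic sensitivity index d′ between two phone categories after projection onto a contrast direction. A minimal sketch (the function name and the pooled-SD convention are illustrative assumptions, not the paper's exact estimator):

```python
import numpy as np

def dprime_along_direction(group_a, group_b, direction):
    """Sensitivity index d' for two embedding sets on a contrast axis.

    group_a, group_b: (n, dim) arrays of phone-level embeddings (e.g. nasal
    vs. oral consonants); direction: (dim,) contrast vector estimated from
    healthy speech. Smaller d' means the categories are harder to separate,
    the degradation signal used for severity profiling.
    """
    u = direction / np.linalg.norm(direction)
    a, b = group_a @ u, group_b @ u
    pooled_sd = np.sqrt(0.5 * (a.var(ddof=1) + b.var(ddof=1)))
    return abs(a.mean() - b.mean()) / pooled_sd
```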
Woody Santos, J. B.; Chen, L.; Miranda Quintana, R. A.
We present mdBIRCH, an online clustering method that adapts the BIRCH CF-tree to molecular dynamics (MD) data by using a merge test calibrated directly to RMSD. Each arriving frame is routed to the nearest centroid and added only if the post-merge radius computed from the cluster feature remains within a user-supplied threshold. This keeps the average deviation to each cluster centroid bounded as the cluster grows and preserves a simple interpretation of resolution in physical units. We evaluate mdBIRCH on a β-heptapeptide and the HP35 system. We propose two protocols to make the threshold selection easier: (a) RMSD-anchored runs that use controlled structural edits to define interpretable operating points and (b) a blind sweep that tracks how cluster count, occupancy, and coverage change with the threshold. In both systems, increasing the threshold reduces the number of clusters, concentrates coverage in high-occupancy states, and broadens within-cluster RMSD distributions. Furthermore, because decisions rely only on cluster summaries, mdBIRCH completely avoids the need for pairwise distance matrices, scales near-linearly with the number of frames on standard hardware, and naturally supports incremental operation. The method offers a practical combination of speed and interpretability for large-scale trajectory analysis.
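The merge test can be sketched from the standard BIRCH cluster feature (count N, linear sum LS, squared sum SS), from which the post-merge radius is available in closed form. This is a hedged reconstruction of the idea, not the authors' code (class and function names are invented):

```python
import numpy as np

class CFCluster:
    """BIRCH-style cluster feature: count n, linear sum ls, squared sum ss."""
    def __init__(self, x):
        x = np.asarray(x, float)
        self.n, self.ls, self.ss = 1, x.copy(), float(x @ x)

    def radius_if_merged(self, x):
        """RMS distance of members to centroid after tentatively adding x."""
        x = np.asarray(x, float)
        n, ls, ss = self.n + 1, self.ls + x, self.ss + float(x @ x)
        return np.sqrt(max(ss / n - (ls @ ls) / n**2, 0.0))

    def merge(self, x):
        x = np.asarray(x, float)
        self.n += 1; self.ls += x; self.ss += float(x @ x)

def route_frame(clusters, frame, threshold):
    """Add frame to the nearest centroid if the post-merge radius stays
    within the user threshold (same units as the frames); otherwise open a
    new cluster. Decisions use only cluster summaries, never pairwise
    distance matrices."""
    frame = np.asarray(frame, float)
    if clusters:
        nearest = min(clusters, key=lambda c: np.linalg.norm(c.ls / c.n - frame))
        if nearest.radius_if_merged(frame) <= threshold:
            nearest.merge(frame)
            return nearest
    new = CFCluster(frame)
    clusters.append(new)
    return new
```

Because each decision touches one cluster summary per existing cluster, the cost per frame is independent of trajectory length, which is where the near-linear scaling comes from.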
Lu, R.; Liu, S.; Liu, Y.; Duncan, J.; Henson, R. N.; Woolgar, A.
Time-resolved neural decoding is widely used to track information represented in neural activity, but conventional linear decoders primarily capture phase-locked evoked responses and often fail to recover representations embedded in nonlinear or non-phase-locked dynamics, potentially limiting the interpretation of neural coding. Here, we introduce HeteroRC, a biologically inspired and interpretable decoding framework based on heterogeneous reservoir computing. HeteroRC projects neural signals into a high-dimensional recurrent state space with heterogeneous time constants, enabling nonlinear feature expansion and multiscale temporal integration directly from raw neural time series. Simulations demonstrate that HeteroRC significantly outperforms linear decoders and a suite of artificial neural networks (including RNNs, LSTMs, Transformers and EEGNet) on evoked responses while robustly capturing induced oscillatory power, phase synchrony, and aperiodic modulations--dynamics that are largely latent to conventional linear methods. We further validate HeteroRC on two empirical EEG datasets. In a motor imagery task, it substantially improves decoding accuracy and exhibits superior cross-temporal generalisation, revealing dynamic representational transformations. In an attentional priority task, HeteroRC uncovers statistically learned spatial priority information that remains hidden from conventional methods, successfully decoding these latent states previously thought to be activity-silent. Furthermore, we develop a dual-level interpretability framework linking reservoir dynamics to virtual sources and sensor space, revealing the temporal, spectral, and spatial signatures underlying decoding performance at both the individual and group levels. 
Together, HeteroRC offers an interpretable approach to decode information from dynamic neural responses, broadening the analytical scope of neural decoding while remaining computationally efficient and free from manual feature engineering, making it particularly suitable for small-sample electrophysiological studies.
Reddy, S. T.
The softmax attention mechanism in transformer architectures (Vaswani et al., 2017) is mathematically identical to the Boltzmann distribution governing molecular binding at thermal equilibrium (Boltzmann, 1877). Luce's Choice Axiom (1959) establishes this function - which we term the convergence equation - as the unique function satisfying five axioms of competitive selection: positivity, normalization, unrestricted domain, rank preservation, and independence of irrelevant alternatives. We show that five additional architecture conditions - discrete intermolecular contacts, bilinear energy decomposition, finite competitor pools, thermal equilibrium, and stochastic selection - are satisfied by at least ten biological molecular recognition systems and together prescribe a complete neural architecture: dual encoders, cross-attention, InfoNCE contrastive training, symmetric loss, learned temperature, and cross-attentive decoder. We term this architecture a Specificity Foundation Model (SFM) and specify it for antibody-antigen, TCR-peptide-MHC, transcription factor-DNA, microRNA-mRNA, enzyme-substrate, CRISPR guide RNA-DNA, drug-target, peptide-MHC, receptor-ligand, and RNA-binding protein-RNA recognition. The first implementation (CALM; Lee et al., 2026) achieves antibody-antigen retrieval from approximately 4,000 training pairs with ~100,000-fold greater data efficiency than comparable contrastive architectures trained without the physics derivation. We classify this as Level 3 architecture-physics alignment and derive three further theoretical results: an exponential scaling law for retrieval accuracy as a function of training data diversity (the MRC scaling law), a two-parameter affinity calibration framework connecting contrastive scores to binding free energies, and a hybrid recursive learning framework for cross-modal reinforcement learning with orthogonal verification.
The failure conditions of the framework are analyzed in terms of the validity of equilibrium thermodynamics for molecular binding and the convergence properties of gradient-based parameter estimation.
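The softmax-Boltzmann identity at the core of the argument is easy to state explicitly; both expressions below are standard, and the identification of logits with negative energies is our paraphrase of the abstract's claim:

```latex
% Softmax attention weight over competing keys (logits e_i):
\alpha_i = \frac{\exp(e_i)}{\sum_j \exp(e_j)}, \qquad e_i = \frac{q \cdot k_i}{\sqrt{d}}
% Boltzmann occupancy over competing binding states (energies E_i):
p_i = \frac{\exp(-E_i / k_B T)}{\sum_j \exp(-E_j / k_B T)}
% The two coincide term by term under e_i \leftrightarrow -E_i/(k_B T),
% with a learned softmax temperature playing the role of 1/(k_B T).
```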
Cheng, K.; Liu, Y.; Nie, Z.; Lin, M.; Hou, Y.; Tao, Y.; Liu, C.; Chen, J.; Mao, Y.; Tian, Y.
Understanding the structural dynamics of biomolecules is crucial for uncovering biological functions. As molecular dynamics (MD) simulation data becomes more available, deep generative models have been developed to synthesize realistic MD trajectories. However, existing methods produce fixed-length trajectories by jointly denoising high-dimensional spatiotemporal representations, which conflicts with MD's frame-by-frame integration process and fails to capture time-dependent conformational diversity. Inspired by MD's sequential nature, we introduce a new probabilistic autoregressive (ProAR) framework for trajectory generation. ProAR uses a dual-network system that models each frame as a multivariate Gaussian distribution and employs an anti-drifting sampling strategy to reduce cumulative errors. This approach captures conformational uncertainty and time-coupled structural changes while allowing flexible generation of trajectories of arbitrary length. Experiments on ATLAS, a large-scale protein MD dataset, demonstrate that for long trajectory generation, our model achieves a 7.5% reduction in reconstruction RMSE and an average 25.8% improvement in conformation change accuracy compared to previous state-of-the-art methods. For the conformation sampling task, it performs comparably to specialized time-independent models, providing a flexible and dependable alternative to standard MD simulations.
Qiao, Y.; Ma, Z.
Gut microbiome studies in Parkinson's disease (PD) are challenged by high dimensionality, sparsity, compositionality, and substantial between-cohort heterogeneity, all of which complicate robust community typing and disease-status classification. Here, we developed a variational autoencoder (VAE)-based methodology for deep enterotyping and PD diagnosis prediction (i.e., predicting diseased vs. control status) using a harmonized multi-cohort gut microbiome compendium comprising 1,957 16S rRNA samples from six PD case-control cohorts and an independent shotgun metagenomic validation cohort of 725 samples. Compared with conventional enterotyping approaches such as partitioning around medoids (PAM) and Dirichlet multinomial mixture (DMM) modelling, the VAE-derived latent space supported a clearer and more reproducible three-cluster solution. These three enterotype-like community states were biologically interpretable and were annotated as Enterococcus-type, Bacteroides-type, and Ruminococcus-type configurations. The same broad three-enterotype structure was independently recapitulated in the metagenomic dataset, supporting cross-platform robustness. Across the three inferred types, the proportion of PD samples was similar, and both the primary generalized linear mixed-effects model and sensitivity model showed that enterotype assignment was not a significant differentiating factor for PD status and that the lack of association was not dependent on a single modelling strategy. In the supervised branch, VAE-derived representations supported PD case-control classification while also providing a shared latent representation for clustering, enterotype transfer, and downstream interpretation. Collectively, these findings show that deep representation learning can improve the resolution, reproducibility, and interpretability of enterotype inference in heterogeneous microbiome datasets, and provide a practical methodology for organizing broad community structure in PD.
In this setting, the main advantage of the VAE method lies in its ability to link unsupervised community typing with supervised prediction through a shared latent representation, even when broad community types do not function as stand-alone disease biomarkers.
Petersen, M.; Patil, K. R.; Eickhoff, S. B.; Biessels, G. J.; Meta VCI Map Consortium
Because many neurobehavioural functions rely on distributed brain networks, anatomically diverse brain lesions can cause the same neurobehavioural deficit. Lesion network mapping (LNM) builds on this principle to better understand the functional anatomy of the brain, by mapping focal lesions onto a normative functional connectome. Yet, recent work raised concerns about anatomical specificity of LNM, showing that commonly used LNM procedures converge on nonspecific connectome properties. Here, we show that anatomically specific LNM is possible with the right statistical approach using symptom-label permutation as a null model. We demonstrate this in a multicenter dataset of 2,950 stroke patients across 12 cohorts, comparing patients with and without impairment in 6 cognitive domains. First, we showed that permutation-based LNM yielded distinct and biologically plausible network maps with modest cross-cognitive domain similarity. Second, we replicated the previously raised concern of nonspecific connectome-driven convergence when using parametric statistics. Third, we assessed specificity of our approach through simulation analyses across 10,000 null studies which confirmed that the permutation framework maintained valid type I error control. These findings demonstrate that permutation-based null models preserve anatomical specificity in LNM, enabling the identification of brain networks that are genuinely linked to distinct neurobehavioural functions. This approach may thus allow researchers to more reliably map the network basis of neurobehavioural deficits from focal brain lesions.
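The symptom-label permutation idea generalizes well beyond this study: shuffling labels preserves the connectome's nonspecific structure under the null, so only anatomically specific effects survive. A minimal Python sketch of such a node-wise permutation test (function name and the two-group difference statistic are illustrative assumptions, not the authors' pipeline):

```python
import numpy as np

def permutation_map(connectivity, impaired, n_perm=1000, seed=0):
    """Node-wise permutation test for lesion network mapping.

    connectivity: (n_patients, n_nodes) lesion-to-node connectivity values;
    impaired: boolean (n_patients,) symptom labels. The observed group
    difference at each node is compared against a null built by shuffling
    the symptom labels. Returns empirical two-sided p-values per node.
    """
    rng = np.random.default_rng(seed)

    def stat(labels):
        return connectivity[labels].mean(0) - connectivity[~labels].mean(0)

    observed = stat(impaired)
    exceed = np.zeros(connectivity.shape[1])
    for _ in range(n_perm):
        perm = rng.permutation(impaired)
        exceed += np.abs(stat(perm)) >= np.abs(observed)
    # add-one correction keeps p-values strictly positive
    return (exceed + 1) / (n_perm + 1)
```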
Jackson, N. J.; Yan, C.; Caro-Vega, Y.; Paredes, F.; Ismerio Moreira, R.; Cadet, S.; Varela, D.; Cesar, C.; Duda, S. N.; Shepherd, B. E.; Malin, B. A.
Digital health technologies, including machine learning (ML), are transforming infectious disease management; however, ML models for HIV care have been limited by data sharing restrictions that prevent multi-site collaboration. Federated Learning (FL) offers a privacy-preserving solution, enabling cross-site model training without sharing patient-level data. We evaluated FL for developing clinical prediction models using data from 22,234 people living with HIV (PLWH) across six sites in five countries within the Caribbean, Central, and South America network for HIV epidemiology (CCASAnet). Across four prediction tasks --- 1-year mortality, 3-year mortality, tuberculosis incidence, and AIDS-defining cancer incidence --- FL algorithms achieved near-centralized performance while substantially outperforming site-specific models. Performance gains varied across sites, driven by both site size and between-site heterogeneity. Local fine-tuning often improved FL performance, though benefits were task dependent. These findings support FL as a scalable, privacy-preserving infrastructure for multi-site ML in international HIV research.
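The aggregation step at the heart of such FL pipelines is, in its simplest FedAvg form, a sample-size-weighted average of per-site parameters; only these parameters, never patient records, leave each site. A minimal sketch (not CCASAnet's actual stack):

```python
import numpy as np

def fedavg(site_weights, site_sizes):
    """One FedAvg aggregation round.

    site_weights: list (per site) of lists of parameter arrays;
    site_sizes: number of training samples at each site.
    Each parameter is averaged across sites weighted by sample count.
    """
    total = sum(site_sizes)
    return [
        sum(w[k] * (n / total) for w, n in zip(site_weights, site_sizes))
        for k in range(len(site_weights[0]))
    ]
```

The "local fine-tuning" variant mentioned above simply continues training the aggregated model on each site's own data after rounds like this one.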
Sabharwal, A.; Patel, M. S.; Carrano, A.; Rotman, M.; Wierson, W.; Ekker, S. C.
The deployment of large language models (LLMs) for science carries an intrinsic risk: hallucination of citations, fabricated drug approvals or clinical trials, and unsupported experimental outcomes. Here we describe the testing and deployment of a novel systematic, multi-layer approach called the Validation as a System (VaaS) pipeline, iteratively developed during the construction of an open-source, living Rare Disease Database (RDD). We report lessons learned and production results from 225 carefully annotated rare disease gene curations and a prospective 100-gene collection (99 net new), together representing over 3,000 verified citations. After three iterations of directed refinement, the net functional hallucination rate approached zero. We validated the pipeline using three complementary benchmarks: (1) VaaS-RIKER2, a 640-run prospective ablation study (4 conditions x 4 temperatures x 40 genes) plus 117 open-weight model runs on dedicated GPU hardware - unguided LLM output produced 95.9% Type II hallucination (wrong-topic citations that exist as real papers but carry a correct claim context yet do not support the cited claim); the full VaaS protocol achieved 0.0% Type I and 6.5% Type II, a >14-fold reduction; live PMID verification alone (C3) eliminated both error types entirely (0.0%/0.0%); (2) an independent L3 citation audit of Wave 3 (179 PMIDs, 99.4% valid, 0 Type I errors); and (3) the MedHallu clinical hallucination benchmark, on which the VaaS protocol achieved F1 = 0.9853 on the hard tier (cases where all benchmark ensemble models were fooled), compared to the published GPT-4o baseline of F1 = 0.811 (Pandit et al., 2025). Three independent open-weight models (llama3.2, qwen2.5:14b, mistral:7b) showed 81-87% Type II rates under unguided conditions, confirming that wrong-topic citation hallucination is structural and model-agnostic.
In contrast, the corresponding VaaS rate was measured to be zero (n = 508 verified citations; 160 runs, C4 full protocol) under the same conditions. Human validation of ≥50 entries confirmed zero Type I errors and less than 0.5% Type II errors in the manual curation test. The VaaS pipeline operated at less than ~$1 overall per comprehensive gene review, demonstrating that citation-integrity standards in AI-assisted biomedical synthesis are achievable at production scale. The VaaS approach represents, to the authors' knowledge, the lowest measured hallucination rate of any system for science to date and is set to further accelerate the use of AI and AI agents for advancing research.
Grigas, A. T.; Sumner, J.; O'Hern, C. S.
Protein structure is controlled by a high-dimensional energy landscape, which is a function of all of the atomic coordinates of the protein. Can this landscape be accurately described by a low-dimensional representation? We find that residue core identity, a binary N-dimensional encoding indicating whether each of the N amino acids in a protein is buried in the core or not, can predict the protein's backbone conformation more efficiently than all other representations that we tested. Core identity is 4 times more efficient than previous estimates of the bits per residue needed to encode a protein's native fold, 2 times more efficient than the Cα contact map, and 1.5 times more efficient than the machine-learned embeddings from FoldSeek's 3Di. Even when the folded structure is unavailable, predicting each residue's burial from sequence yields a more accurate estimate of fold quality than predicting pairwise contacts from the same sequence information. Thus, this work emphasizes that the problem of determining a protein's native fold can be re-framed as predicting each residue's core identity.
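The raw bookkeeping behind the comparison is easy to make concrete: a binary core identity costs one bit per residue, while a full binary contact map costs one bit per residue pair. A tiny illustration (the helper name is ours; the paper's quoted efficiency factors additionally weight bits by how much structural information each one carries, so this is only the counting side of the argument):

```python
def encoding_bits(n_residues):
    """Raw bit costs of two binary fold encodings for an N-residue protein.

    Core identity: one buried/exposed bit per residue -> N bits.
    Contact map: one bit per unordered residue pair -> N(N-1)/2 bits.
    """
    core_identity = n_residues
    contact_map = n_residues * (n_residues - 1) // 2
    return core_identity, contact_map
```

For a 100-residue protein this is 100 bits versus 4,950 bits, i.e. the contact map spends (N-1)/2 bits per residue before any compression.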
Ye, C.; Liao, J.; Yin, Z.; Li, Y.; Xu, Y.; Fan, H.; Ma, T.; Zhang, J.
Sleep disturbances are pervasive, debilitating non-motor symptoms of Parkinson's disease (PD), where sleep spindle deficits directly drive cognitive decline and disease progression. Current adaptive deep brain stimulation (aDBS) for PD is largely limited to motor symptom management, with no established technical foundation for sleep spindle-targeted closed-loop modulation. The functional role of the basal ganglia in human sleep spindle regulation remains incompletely characterized, and no robust cross-subject pipeline exists to decode these transient events from clinically implanted DBS electrodes. Here, we developed a connectomics-guided meta-learning framework for cross-subject sleep spindle decoding and anticipatory prediction, using whole-night synchronized basal ganglia local field potential and polysomnography data from 17 PD patients with bilateral DBS implants. Our framework achieved 92.63% accuracy for concurrent spindle decoding and 83.44% accuracy for 2-second-ahead prediction, with optimal signals localized to the limbic subthalamic nucleus and <50 ms total latency meeting real-time closed-loop requirements. This work defines the neuroanatomical substrate of basal ganglia spindle signaling in PD, establishes the cross-subject spindle decoding pipeline for clinical DBS systems, and provides a critical translational foundation for sleep-targeted closed-loop aDBS to mitigate PD non-motor burden.
Gorenshtein, A.; Sorka, M.; Omar, M.; Miron, K.; Hatav, A.; Barash, Y.; Klang, E.; Shelly, S.
Most clinical large language model (LLM) benchmarks rely on clean, concise vignettes that do not reflect the noisy, long-form documentation typical of real clinical records. How LLM performance degrades under realistic chart conditions remains poorly characterised. Here we test whether structured retrieval workflows protect National Institutes of Health Stroke Scale (NIHSS) scoring accuracy under systematic context stress. Using 100 de-identified acute stroke cases and a fully crossed 4 x 4 x 3 x 3 condition matrix (144 conditions per case), we vary context acquisition method, document length, distractor load and critical-information position across four Gemini models (57,047 retained runs). Structured retrieval reduces mean absolute error (MAE) from 4.58 to 2.96 points relative to non-agentic baselines (mean gain 1.62 MAE points; 95% CI 1.57 to 1.67; 35% relative reduction), with consistent gains across all 36 stress combinations. Lower-cost models show disproportionately larger gains (2.76 versus 0.45 MAE points). Tool-retrieved pipelines outperform retrieval-augmented generation in 33 of 36 combinations. These findings indicate that retrieval architecture, rather than model scale alone, is a tractable lever for robust, equitable clinical LLM deployment.
Hakata, Y.; Oikawa, M.; Fujisawa, S.
Background. Federated learning (FL) enables collaborative model training across institutions without sharing patient-level data. However, standard FL algorithms such as FedAvg degrade under non-independently and non-identically distributed (non-IID) data, a prevalent condition when patient demographics, scanner hardware, and disease prevalence differ across hospital sites. Objective. We propose iPS-MFFL (Individualized Per-Site Meta-Federated Feature Learning), a federated framework with a hierarchical local-model architecture that addresses non-IID heterogeneity through (1) a shared feature extractor, (2) multiple weak-learner classification heads that can be trained with heterogeneous training objectives to promote complementary decision boundaries, (3) independent per-learner server aggregation so that each weak learner's parameters are averaged only with its counterparts at other clients, and (4) a lightweight meta-model, itself federated, that adaptively stacks the weak-learner outputs. Methods. We evaluate on the Brain Tumor MRI Classification dataset (7,200 images; 4 classes: glioma, meningioma, pituitary tumor, no tumor) partitioned across K = 5 simulated hospital sites using Dirichlet non-IID sampling (alpha = 0.3). Four baselines are compared: Local-only training, FedAvg, FedProx, and Freeze-FT. All experiments are repeated over three random seeds (13, 42, 2025) and evaluated using paired t-tests, Cohen's d effect sizes, and post-hoc power analysis.
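The Dirichlet non-IID split described in Methods is a standard recipe: for each class, draw client proportions from Dirichlet(alpha) and slice that class's samples accordingly, so small alpha yields heavily skewed sites and large alpha approaches an IID split. A sketch under those assumptions (a common recipe, not the authors' exact code):

```python
import numpy as np

def dirichlet_partition(labels, n_clients=5, alpha=0.3, seed=42):
    """Partition sample indices across clients with Dirichlet class skew.

    labels: (n_samples,) integer class labels. For each class, client
    shares are drawn from Dirichlet(alpha * 1) and the class's shuffled
    indices are split at the corresponding cut points.
    Returns a list of index lists, one per client.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    clients = [[] for _ in range(n_clients)]
    for c in np.unique(labels):
        idx = rng.permutation(np.where(labels == c)[0])
        props = rng.dirichlet(alpha * np.ones(n_clients))
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for client, part in zip(clients, np.split(idx, cuts)):
            client.extend(part.tolist())
    return clients
```

Every sample lands on exactly one client, which makes the partition easy to audit before running the federated baselines.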